Time-dependent Cross-probability Model for Feature Vector Normalization
Abstract
In previous works, Multi-Environment Model based LInear Normalization (MEMLIN) and Phoneme-Dependent MEMLIN (PD-MEMLIN) were presented and shown to be effective at compensating for environment mismatch. Both are empirical feature vector normalization techniques that model the clean and noisy spaces with Gaussian Mixture Models (GMMs). A critical element of both algorithms is the cross-probability model: the probability of the clean model Gaussian given the noisy model Gaussian and the noisy feature vector. In the previous works, the cross-probability model was approximated as time-independent. In this paper, a time-dependent estimation based on GMMs is proposed for MEMLIN and PD-MEMLIN. Experiments with the SpeechDat Car database were carried out to study the performance of the proposed estimation of the cross-probability model in a real acoustic environment. The mean improvement in Word Error Rate (WER) was 78.48% for MEMLIN and 76.76% for PD-MEMLIN, compared with 70.21% and 75.44%, respectively, when the time-independent cross-probability model is applied.
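The difference between a time-independent and a time-dependent cross-probability can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one diagonal-covariance Gaussian over the noisy vector per (noisy Gaussian, clean Gaussian) pair, whereas the abstract only says the estimation is based on GMMs. All function names, array shapes, and the random toy data are hypothetical.

```python
import numpy as np

def log_gauss_diag(y, mean, var):
    """Log density of diagonal-covariance Gaussians evaluated at y."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (y - mean) ** 2 / var, axis=-1)

def time_independent_cross_prob(pair_weight):
    """p(s_x | s_y): one fixed table, identical for every frame.

    pair_weight[s_y, s_x] is a positive co-occurrence weight (assumed given).
    """
    return pair_weight / pair_weight.sum(axis=1, keepdims=True)

def time_dependent_cross_prob(y_t, pair_weight, pair_means, pair_vars):
    """p(s_x | s_y, y_t): the table is re-weighted by the current noisy frame.

    Assumption: p(y_t | s_x, s_y) is a single diagonal Gaussian per pair.
    """
    log_post = np.log(pair_weight) + log_gauss_diag(y_t, pair_means, pair_vars)
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

# Toy data: 2 noisy Gaussians, 3 clean Gaussians, 4-dimensional features.
rng = np.random.default_rng(0)
n_noisy, n_clean, dim = 2, 3, 4
pair_weight = rng.random((n_noisy, n_clean)) + 0.1
pair_means = rng.normal(size=(n_noisy, n_clean, dim))
pair_vars = np.ones((n_noisy, n_clean, dim))
y_t = rng.normal(size=dim)  # one noisy feature vector

static = time_independent_cross_prob(pair_weight)
dynamic = time_dependent_cross_prob(y_t, pair_weight, pair_means, pair_vars)
```

In this sketch `static` is the same for every frame, while `dynamic` changes with `y_t`. If every pair shared the same Gaussian, the frame would carry no extra information and the two tables would coincide.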
Similar resources
Time-dependent cross-probability model for multi-environment model based LInear normalization
In a previous work, Multi-Environment Model based LInear Normalization, MEMLIN, was presented and it was proved to be effective to compensate environment mismatch. MEMLIN is an empirical feature vector normalization which models clean and noisy spaces by Gaussian Mixture Models (GMMs). In this algorithm, the probability of the clean model Gaussian, given the noisy model one and the noisy featur...
Integrated Feature Normalization and Enhancement for robust Speaker Recognition using Acoustic Factor Analysis
State-of-the-art factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace, which does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In th...
Integrated Feature Normalization and Enhancement for Robust Speaker Recognition Using Acoustic
State-of-the-art factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace, which does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In th...
Unsupervised training scheme with non-stereo data for empirical feature vector compensation
In this paper, a novel training scheme based on unsupervised and non-stereo data is presented for Multi-Environment Model-based LInear Normalization (MEMLIN) and MEMLIN with cross-probability model based on GMMs (MEMLIN-CPM). Both are data-driven feature vector normalization techniques which have proved very effective in dynamic noisy acoustic environments. However, this kind of techniques ...
A recursive feature vector normalization approach for robust speech recognition in noise
The acoustic mismatch between testing and training conditions is known to severely degrade the performance of speech recognition systems. Segmental feature vector normalization [8] was found to improve the noise robustness of MFCC feature vectors and to outperform other state-of-the-art noise compensation techniques in speaker-dependent recognition. The objective of feature vector normalization...